Poorly Structured Handwritten Documents Segmentation using Continuous Probabilistic Feature Grammars

Author

  • T. Artières
Abstract

This work deals with the segmentation of poorly structured handwritten documents, such as pages of handwritten notes produced with pen-based interfaces. We propose a formalism, based on Probabilistic Feature Grammars, that exhibits some interesting features. It allows handling ambiguities and taking into account contextual information, such as spatial relations between objects in the page.

Similar resources

Content-based Information Retrieval from Handwritten Documents

This paper is about retrieving the closest matches from a set of scanned handwritten documents based on a query that is a document image. System indexing and retrieval is based on writer characteristics, textual content, and document metadata such as writer profile. Documents are indexed using global image features, e.g., stroke width, slant, word gaps, as well as local features that descri...

Indexing Real-World Data using Semi-Structured Documents

We address the problem of deriving meaningful semantic index information for a multi-media database using a semi-structured document model. We show how our framework, called feature grammars, can be used to (1) exploit third-party interpretation modules for real-world unstructured components, and (2) use context-free grammars to convert such poorly or unstructured input to semi-structured outpu...

Radial Line Fourier Descriptor for Segmentation-free Handwritten Word Spotting

Automatic recognition of historical handwritten manuscripts is a daunting task due to paper degradation over time. Recognition-free retrieval or word spotting is popularly used for information retrieval and digitization of the historical handwritten documents. However, the performance of word spotting algorithms depends heavily on feature detection and representation methods. Although there exi...

Holistic Approach for Classifying and Retrieving Personal Arabic Handwritten Documents

This paper presents a novel holistic technique for classifying and retrieving Arabic handwritten text documents. The retrieval of Arabic handwritten documents is performed in several steps. First, the Arabic handwritten document images are segmented into words, and then each word is segmented into its connected parts. Second, several features are extracted from these connected parts and then co...

Connected Component Based Word Spotting on Persian Handwritten image documents

Word spotting is to make searchable unindexed image documents by locating word/words in a document image, given a query word. This problem is challenging, mainly due to the large number of word classes with very small inter-class and substantial intra-class distances. In this paper, a segmentation-based word spotting method is presented for multi-writer Persian handwritten doc-...

Publication date: 2003